11 research outputs found

    Study of urinary uric acid and creatinine ratio as a marker of neonatal asphyxia for babies born in a tertiary care hospital

    Background: Perinatal asphyxia is a common neonatal problem and a significant contributor to neonatal morbidity and mortality. It is regarded as an important and common cause of preventable cerebral injury. Predicting the outcome of perinatal asphyxia is important but difficult. The Apgar score has only a limited role in predicting the immediate outcome, such as hypoxic-ischaemic encephalopathy (HIE), and the long-term neurological sequelae, and it is prone to observational error, whereas biochemical parameters are more reliable. This study evaluated the utility of the urinary uric acid to creatinine ratio (UA/Cr ratio) as a non-invasive, easy, inexpensive and early biochemical means of diagnosing asphyxia. Methods: In this prospective case-control study, conducted at KAPV Government Medical College from February 2017 to September 2017, 100 asphyxiated and 100 non-asphyxiated newborns were included. A detailed history and assessment were obtained for all enrolled newborns. Spot urine samples were sent for uric acid and creatinine estimation. Results were recorded and statistical analysis was performed. Results: The mean uric acid/creatinine ratio in the case and control groups was 2.58±1.09 and 0.86±0.17 respectively. The ratio also correlated well with the stage of HIE. Conclusions: The UA/Cr ratio enables early and rapid recognition of asphyxial injury, as well as assessment of its severity and the potential for short-term morbidity or death.
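
    As a purely illustrative aside, the marker described above is a simple ratio of two spot-urine measurements. The sketch below (Python, with assumed units of mg/dL) computes the ratio and compares it against a hypothetical cut-off chosen between the two group means reported in the abstract; the study itself does not specify this cut-off.

```python
# Illustration only: computes the spot-urine uric acid / creatinine ratio used
# as the marker in this study. The cut-off below is a hypothetical placeholder
# chosen between the reported group means, not a threshold from the study.

def ua_cr_ratio(uric_acid_mg_dl: float, creatinine_mg_dl: float) -> float:
    """Urinary uric acid to creatinine ratio (both concentrations in mg/dL)."""
    if creatinine_mg_dl <= 0:
        raise ValueError("creatinine concentration must be positive")
    return uric_acid_mg_dl / creatinine_mg_dl

MEAN_CASES = 2.58        # mean ratio in asphyxiated newborns (from the abstract)
MEAN_CONTROLS = 0.86     # mean ratio in non-asphyxiated newborns (from the abstract)
HYPOTHETICAL_CUTOFF = 1.5    # placeholder value, not reported by the study

ratio = ua_cr_ratio(uric_acid_mg_dl=38.0, creatinine_mg_dl=20.0)
print(f"UA/Cr = {ratio:.2f}; above hypothetical cut-off: {ratio > HYPOTHETICAL_CUTOFF}")
```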

    Trial of vitamin D supplementation to prevent asthma exacerbation in children

    Background: The aim was to assess the level of vitamin D in children with bronchial asthma and to study the effect of vitamin D supplementation in asthmatic children with vitamin D deficiency, in terms of asthma control test score and number of exacerbations. Methods: This interventional study was conducted in the Department of Paediatrics, KAPV Government Medical College, Trichy, Tamil Nadu, India from September 2016 to February 2017. Ninety-six asthmatic children aged 5-12 years who attended the outpatient department or were admitted to the ward with asthma exacerbation were selected. After assessment of their vitamin D level, vitamin D supplementation was given along with standard treatment for asthma. The outcomes measured were the asthma control test score (ACTS), number of emergency room visits, number of hospital admissions and reliever medication use. Results: Of the 96 children, 83 (86.4%) had vitamin D deficiency. There was a significant correlation between vitamin D level and absolute eosinophil count (p=0.037), asthma severity (p<0.001) and asthma control (p<0.001). A significant reduction in emergency room visits (p<0.001) and reliever medication use (p<0.001), and an improvement in asthma control test score (p=0.008), occurred after vitamin D supplementation. Conclusions: There is a significant correlation between vitamin D level, asthma severity and asthma control. Asthma exacerbations, in terms of emergency room visits and reliever medication use, were reduced by vitamin D supplementation.

    On Characterizing the Data Movement Complexity of Computational DAGs for Parallel Execution

    Technology trends are making the cost of data movement increasingly dominant, both in terms of energy and time, over the cost of performing arithmetic operations in computer systems. The fundamental ratio of aggregate data movement bandwidth to total computational power (also referred to as the machine balance parameter) in parallel computer systems is decreasing. It is therefore of considerable importance to characterize the inherent data movement requirements of parallel algorithms, so that the minimal architectural balance parameters required to support them on future systems can be well understood. In this paper, we develop an extension of the well-known red-blue pebble game to derive lower bounds on the data movement complexity for the parallel execution of computational directed acyclic graphs (CDAGs) on parallel systems. We model multi-node multi-core parallel systems, with the total physical memory distributed across the nodes (which are connected through some interconnection network) and a multi-level shared cache hierarchy for the processors within a node. We also develop new techniques for lower bound characterization of non-homogeneous CDAGs. We demonstrate the use of the methodology by analyzing the CDAGs of several numerical algorithms to develop lower bounds on data movement for their parallel execution.
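
    As background for the cost model named above, the following sketch simulates the classical sequential two-level red-blue pebble game on a small, made-up CDAG: a limited pool of red pebbles models fast memory, blue pebbles model slow memory, and every load or spill between the two counts as one I/O. It only illustrates the quantity being bounded, not the paper's parallel multi-node, multi-level extension; the example DAG, schedule and LRU-style eviction policy are assumptions for the sketch.

```python
# Minimal sketch (illustration only) of the classical two-level red-blue pebble
# game: red pebbles are a bounded fast memory, blue pebbles are unbounded slow
# memory, and each blue->red load or red->blue spill costs one I/O. Assumes
# num_red exceeds the largest in-degree; eviction is LRU among unneeded values.
from collections import OrderedDict

def count_io(preds, schedule, inputs, outputs, num_red):
    """I/O cost of executing `schedule` (a topological order) with `num_red` red pebbles."""
    red = OrderedDict()       # values currently in fast memory, LRU order
    blue = set(inputs)        # inputs start in slow memory
    io = 0

    def evict(protected):
        nonlocal io
        victim = next(v for v in red if v not in protected)   # oldest unprotected value
        del red[victim]
        if victim not in blue:            # spill red -> blue
            blue.add(victim)
            io += 1

    def ensure_red(v, protected):
        nonlocal io
        if v in red:
            red.move_to_end(v)
            return
        if len(red) >= num_red:
            evict(protected)
        red[v] = True                     # load blue -> red
        io += 1

    for v in schedule:
        needed = set(preds.get(v, []))
        for p in preds.get(v, []):
            ensure_red(p, needed)
        if len(red) >= num_red:
            evict(needed)
        red[v] = True                     # compute v into fast memory (no I/O)

    for v in outputs:                     # results must end up in slow memory
        if v not in blue:
            io += 1
    return io

# Made-up CDAG:  a,b -> c   b,d -> e   c,e -> f
preds = {"c": ["a", "b"], "e": ["b", "d"], "f": ["c", "e"]}
print(count_io(preds, ["c", "e", "f"], inputs={"a", "b", "d"},
               outputs={"f"}, num_red=3))
```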

    Beyond Reuse Distance Analysis: Dynamic Analysis for Characterization of Data Locality Potential

    Emerging computer architectures will feature drastically decreased flops/byte (ratio of peak processing rate to memory bandwidth) as highlighted by recent studies on Exascale architectural trends. Further, flops are getting cheaper while the energy cost of data movement is increasingly dominant. The understanding and characterization of data locality properties of computations are critical in order to guide efforts to enhance data locality. Reuse distance analysis of memory address traces is a valuable tool to perform data locality characterization of programs. A single reuse distance analysis can be used to estimate the number of cache misses in a fully associative LRU cache of any size, thereby providing estimates on the minimum bandwidth requirements at different levels of the memory hierarchy to avoid being bandwidth bound. However, such an analysis only holds for the particular execution order that produced the trace. It cannot estimate potential improvement in data locality through dependence preserving transformations that change the execution schedule of the operations in the computation. In this article, we develop a novel dynamic analysis approach to characterize the inherent locality properties of a computation and thereby assess the potential for data locality enhancement via dependence preserving transformations. The execution trace of a code is analyzed to extract a computational directed acyclic graph (CDAG) of the data dependences. The CDAG is then partitioned into convex subsets, and the convex partitioning is used to reorder the operations in the execution trace to enhance data locality. The approach enables us to go beyond reuse distance analysis of a single specific order of execution of the operations of a computation in characterization of its data locality properties. It can play a valuable role in identifying promising code regions for manual transformation, as well as assessing the effectiveness of compiler transformations for data locality enhancement. We demonstrate the effectiveness of the approach using a number of benchmarks, including case studies where the potential shown by the analysis is exploited to achieve lower data movement costs and better performance. Comment: Published in ACM Transactions on Architecture and Code Optimization (2014).
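
    For context on the baseline technique, the sketch below implements the standard single-trace reuse distance analysis mentioned above: it computes, for each memory access, the number of distinct addresses touched since the previous access to the same address, and from that estimates misses in a fully associative LRU cache of a given capacity (an access hits exactly when its reuse distance is finite and smaller than the capacity). The trace is a made-up example and the quadratic stack scan is kept for clarity rather than efficiency; this is not the paper's CDAG-based analysis.

```python
# Minimal sketch of classical reuse distance analysis for one fixed execution
# order: the reuse distance of an access is the number of *distinct* addresses
# touched since the previous access to the same address. In a fully associative
# LRU cache of capacity C, an access hits iff its reuse distance is finite and
# smaller than C. The trace below is a made-up example.
from collections import OrderedDict

def reuse_distances(trace):
    """Return one reuse distance per access (None for cold accesses)."""
    stack = OrderedDict()          # addresses ordered from least to most recent
    dists = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)     # O(n) scan; real tools use a tree instead
            dists.append(len(keys) - keys.index(addr) - 1)
            stack.move_to_end(addr)
        else:
            dists.append(None)     # cold (compulsory) access
            stack[addr] = True
    return dists

def lru_misses(dists, capacity):
    """Estimated misses in a fully associative LRU cache with `capacity` lines."""
    return sum(1 for d in dists if d is None or d >= capacity)

trace = ["a", "b", "c", "a", "b", "d", "a", "c"]
d = reuse_distances(trace)
print(d)                           # [None, None, None, 2, 2, None, 2, 3]
for c in (2, 3, 4):
    print(f"capacity={c}: misses={lru_misses(d, c)} of {len(trace)} accesses")
```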

    On the Hardness of Red-Blue Pebble Games

    Red-blue pebble games model the computation cost of a two-level memory hierarchy. We present various hardness results in different red-blue pebbling variants, with a focus on the oneshot model. We first study the relationship between previously introduced red-blue pebble models (base, oneshot, nodel). We also analyze a new variant (compcost) to obtain a more realistic model of computation. We then prove that red-blue pebbling is NP-hard in all of these model variants. Furthermore, we show that in the oneshot model, a δ-approximation algorithm for δ < 2 is only possible if the unique games conjecture is false. Finally, we show that greedy algorithms are not good candidates for approximation, since they can return significantly worse solutions than the optimum.

    Shared Microexponents: A Little Shifting Goes a Long Way

    This paper introduces Block Data Representations (BDR), a framework for exploring and evaluating a wide spectrum of narrow-precision formats for deep learning. It enables comparison of popular quantization standards, and through BDR, new formats based on shared microexponents (MX) are identified, which outperform other state-of-the-art quantization approaches, including narrow-precision floating-point and block floating-point. MX utilizes multiple levels of quantization scaling with ultra-fine scaling factors based on shared microexponents in the hardware. The effectiveness of MX is demonstrated on real-world models, including large-scale generative pretraining and inferencing, and production-scale recommendation systems.
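
    To make the idea of multi-level shared scaling concrete, the sketch below quantizes a vector using one shared power-of-two scale per block plus a small extra power-of-two shift (a stand-in for a shared microexponent) per sub-block. The block sizes, element bit width and rounding rule are illustrative assumptions, not the specific BDR/MX bit layouts defined in the paper.

```python
# Illustrative sketch of two-level shared scaling in the spirit of shared
# microexponents: each block shares one power-of-two scale, and each small
# sub-block refines it with an extra (micro) power-of-two shift. All sizes and
# bit widths here are made-up choices for clarity.
import numpy as np

def needed_exp(max_abs, qmax):
    """Smallest power-of-two exponent e with max_abs / 2**e <= qmax."""
    return 0 if max_abs == 0 else int(np.ceil(np.log2(max_abs / qmax)))

def quantize_two_level(x, block=16, subblock=2, elem_bits=4, micro_bits=1):
    qmax = 2 ** (elem_bits - 1) - 1            # e.g. 7 for 4-bit signed ints
    y = np.zeros_like(x, dtype=np.float64)
    for b0 in range(0, len(x), block):
        xb = x[b0:b0 + block]
        e_block = needed_exp(np.max(np.abs(xb)), qmax)
        for s0 in range(0, len(xb), subblock):
            xs = xb[s0:s0 + subblock]
            # Microexponent: shift the scale down for small sub-blocks,
            # limited to what micro_bits can encode.
            slack = e_block - needed_exp(np.max(np.abs(xs)), qmax)
            m = int(min(2 ** micro_bits - 1, max(0, slack)))
            scale = 2.0 ** (e_block - m)
            q = np.clip(np.round(xs / scale), -qmax, qmax)
            y[b0 + s0:b0 + s0 + len(xs)] = q * scale
    return y

rng = np.random.default_rng(0)
x = rng.normal(size=64)
xq = quantize_two_level(x)
print("rms quantization error:", float(np.sqrt(np.mean((x - xq) ** 2))))
```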

    Microscaling Data Formats for Deep Learning

    Narrow bit-width data formats are key to reducing the computational and storage costs of modern deep learning applications. This paper evaluates Microscaling (MX) data formats that combine a per-block scaling factor with narrow floating-point and integer types for individual elements. MX formats balance the competing needs of hardware efficiency, model accuracy, and user friction. Empirical results on over two dozen benchmarks demonstrate the practicality of MX data formats as a drop-in replacement for baseline FP32 for AI inference and training with low user friction. We also show the first instance of training generative language models at sub-8-bit weights, activations, and gradients with minimal accuracy loss and no modifications to the training recipe.
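
    As a rough illustration of how a per-block scaled narrow format can act as a drop-in replacement in inference, the sketch below quantizes matmul operands to int8 in blocks of 32 along the reduction dimension, each block carrying one shared power-of-two scale, and accumulates the rescaled integer partial products in floating point. The block size, element type and scale encoding are assumptions for the example, not the MX specification.

```python
# Sketch of a per-block scaled narrow-integer matmul: elements are quantized to
# int8 in blocks of 32 along the reduction dimension, each block carries one
# shared power-of-two scale, and integer partial dot products are rescaled and
# summed in floating point. Illustrative choices only, not the MX formats.
import numpy as np

BLOCK = 32
QMAX = 127            # symmetric int8 range

def block_quantize(x):
    """Quantize the last axis in blocks of BLOCK, each with a shared 2**e scale."""
    n = x.shape[-1]
    assert n % BLOCK == 0
    xb = x.reshape(*x.shape[:-1], n // BLOCK, BLOCK)
    max_abs = np.maximum(np.max(np.abs(xb), axis=-1, keepdims=True), 1e-30)
    scale = 2.0 ** np.ceil(np.log2(max_abs / QMAX))    # one power-of-two scale per block
    q = np.clip(np.round(xb / scale), -QMAX, QMAX).astype(np.int8)
    return q, scale

def block_matmul(a, b):
    """Approximate A @ B with blockwise int8 dot products and float rescaling."""
    qa, sa = block_quantize(a)        # (M, K/B, B), (M, K/B, 1)
    qb, sb = block_quantize(b.T)      # (N, K/B, B), (N, K/B, 1)
    partial = np.einsum("mkb,nkb->mnk", qa.astype(np.int32), qb.astype(np.int32))
    scales = sa[:, None, :, 0] * sb[None, :, :, 0]     # (M, N, K/B)
    return np.sum(partial * scales, axis=-1)

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 64))
B = rng.normal(size=(64, 4))
print("max abs error vs FP matmul:", float(np.max(np.abs(A @ B - block_matmul(A, B)))))
```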

    On Using the Roofline Model with Lower Bounds on Data Movement

    The roofline model is a popular approach to "bounds and bottleneck" performance analysis. It focuses on the limits to the performance of processors because of limited bandwidth to off-chip memory. It models upper bounds on performance as a function of operational intensity, the ratio of computational operations per byte of data moved from/to memory. While operational intensity can be directly measured for a specific implementation of an algorithm on a particular target platform, it is of interest to obtain broader insights on bottlenecks, where various semantically equivalent implementations of an algorithm are considered, along with analysis for variations in architectural parameters. This is currently very cumbersome and requires performance modeling and analysis of many variants. In this paper, we alleviate this problem by using the roofline model in conjunction with upper bounds on the operational intensity of computations as a function of cache capacity, derived using lower bounds on data movement. This enables bottleneck analysis that holds across all dependence-preserving semantically equivalent implementations of an algorithm. We demonstrate the utility of the approach in assessing fundamental limits to performance and energy efficiency for several benchmark algorithms across a design space of architectural variations.
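
    For reference, the roofline bound described above is simply attainable performance = min(peak compute rate, memory bandwidth x operational intensity). The sketch below evaluates it at a few operational intensities; the machine parameters are made-up placeholders, not values from the paper.

```python
# Minimal sketch of the basic roofline bound discussed above:
#   attainable GFLOP/s = min(peak_gflops, bandwidth_GBps * operational_intensity)
# The machine parameters and intensities below are made-up placeholders.

def roofline(operational_intensity, peak_gflops, bandwidth_gbps):
    """Upper bound on performance (GFLOP/s) at a given flops-per-byte ratio."""
    return min(peak_gflops, bandwidth_gbps * operational_intensity)

PEAK_GFLOPS = 1000.0      # hypothetical peak compute rate
BANDWIDTH_GBPS = 100.0    # hypothetical off-chip memory bandwidth
RIDGE_POINT = PEAK_GFLOPS / BANDWIDTH_GBPS    # intensity where compute bound begins

for oi in (0.25, 1.0, 4.0, RIDGE_POINT, 64.0):
    bound = roofline(oi, PEAK_GFLOPS, BANDWIDTH_GBPS)
    regime = "memory-bound" if oi < RIDGE_POINT else "compute-bound"
    print(f"OI = {oi:6.2f} flop/byte -> <= {bound:7.1f} GFLOP/s ({regime})")
```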